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Max-stable processes are the natural extension of the classical 
extreme-value distributions to the functional setting, and they are in¬ 
creasingly widely used to estimate probabilities of complex extreme 
events. In this paper we broaden them from the usual situation in 
which dependence varies according to functions of Euclidean distance 
to situations in which extreme river discharges at two locations on 
a river network may be dependent because the locations are flow- 
connected or because of common meteorological events. In the for¬ 
mer case dependence depends on river distance, and in the second it 
depends on the hydrological distance between the locations, either of 
which may be very different from their Euclidean distance. Inference 
for the model parameters is performed using a multivariate threshold 
likelihood, which is shown by simulation to work well. The ideas are 
illustrated with data from the upper Danube basin. 

1. Introduction. Modeling extreme events has recently become of great 
interest. The financial crisis, heat waves, storms and heavy precipitation 
underline the importance of assessing rare phenomena when few relevant 
data are available. 

There is a vast literature on modeling the univariate upper tail of the 
distribution of environmental quantities such as precipitation or river dis¬ 
charges at a fixed location t. If Xi(t) {i = 1,... ,n) are n S N independent 
measurements of a random spatial process X at location t, then the prob¬ 
ability law of the maximum of the n observations can be approximated by 
the generalized extreme value distribution (GEVD) 

(1) p| max ^ < xl « G(x) = exp{ —(1-b x G M, 

[i=l,...,n On J 
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where z+ = max( 2 ;, 0 ) and G M, On > 0 and ^ G M are the location, scale 
and shape parameters, respectively. For ^ = 0, G(x} is read as the limit 
exp{ —exp(—x)}. In fact, (1) represents the only possible nondegenerate 
limit for maxima of independent and identically distributed sequences of 
random variables [see, e.g.. Coles (2001), Chapter 3]. This justifies the ex¬ 
trapolation to high quantiles using the parametric tail approximation (1) for 
u close to the upper endpoint of the distribution of X{t) by 

(2) p{x(t)>n}«-(l + e^^) 

n V On /+ 

Often, however, univariate considerations are insufficient, because near- 
simultaneous extreme events may cause the most severe damage. In con¬ 
sidering flooding of a river basin, for example, it is crucial to understand 
the extremal dependence between flows at different gauging stations. Many 
authors have analyzed this using multivariate copulas or multivariate ex¬ 
treme value distributions [e.g., Salvadori and De Michele (2010), Renard 
and Lang (2007)], but the explosion of the number of parameters in high 
dimensions limits the applicability of such models, and information on the 
geographical location of the stations cannot be readily incorporated. Mete¬ 
orological considerations suggest that extremal dependence can be modeled 
as a function of the distance between two locations. Indeed, for precipita¬ 
tion, temperature or wind data, the use of Euclidean distance has become 
standard in spatial extremes [e.g., Davison and Gholamrezaee (2012), Huser 
and Davison (2014), Engelke et al. (2015)]. An important class of proba¬ 
bility models for extreme spatial dependence on the Euclidean space is 
the class of max-stable processes, giving several flexible models whose depen¬ 
dence is parameterized in terms of covariance functions [Schlather (2002), 
Opitz (2013)] or of negative dehnite kernels [Brown and Resnick (1977), 
Kabluchko, Schlather and de Haan (2009), Kabluchko (2011)]. Almost all 
such models have hitherto presupposed that extremal dependence depends 
only on the Euclidean distance between two locations, but this may be too 
restrictive when more is known about the physical processes underlying the 
data: locations on a river network may interact because of the flow of water 
downstream between them. 

In this paper we focus on assessment of the risk of extreme discharges on 
river networks in order to understand and prevent flooding. There is long¬ 
standing interest in the application of extreme value statistics in hydrology 
[e.g., Katz, Parlange and Naveau (2002), Keef, Svensson and Tawn (2009), 
Keef, Tawn and Svensson (2009)]. In Europe, floods are major natural haz¬ 
ards that can end human lives and cause huge material damage. Figure 1 
shows the upper Danube basin, which covers most of the German state of 
Bavaria and parts of Baden-Wiirtemberg, Austria and Switzerland, and is 
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Fig. 1. Topographic map of the upper Danube basin, showing sites of 31 gauging stations 
(black circles) along the Danube and its tributaries. Water flows toward gauging station 1. 

regularly affected by flooding. For this reason there is a well-developed sys¬ 
tem of gauging stations that measure the daily average river discharge on 
this river network; the locations of 31 stations are shown on the map. For 
each fixed location tj (j = 1,..., 31) on the network, the approximation (2) 
can be applied to daily measurements Xi{tj) (i = 1,..., n) of river discharge 
(m^/s) in order to model univariate tail probabilities. 

Dependence modeling is more challenging. The extremal coefficient 
9{ti,tj) G [1,2] measures the degree of dependence of large values at two 
locations and tj on the river network; it ranges from 6{ti,tj) = 1 for com¬ 
plete dependence to 6{ti,tj) = 2 for independence. The left panel of Figure 2 
shows its values for all pairs of stations in Figure 1, plotted against their 
Euclidean distances. Unlike similar plots for extreme precipitation, the non- 
Euclidean structure of the network means that this graph shows only a weak 
relationship. 

In this paper we aim to exploit both the geographical structure of the 
river basin and the hydrological properties of the network in order to provide 
a parsimonious model for extremal dependence. The resulting dependence 
function has two parts: 

• since precipitation is the major source of extreme river discharges and it 
is spatially dependent, one also expects higher dependence of river dis¬ 
charges at stations which are close. The left panel of Figure 2 suggests 
that the Euclidean distance between stations has low explanatory power, 
so we shift each gauging station to a new position in the center of its 
sub-catchment, which we call its hydrological position. The extremal coef¬ 
ficients plotted against the hydrological distance between the hydrological 
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Eig. 2. Extremal coefficients (estimated using the madogram) of all pairs of gauging 
stations plotted against Euclidean distance (left) and hydrological distance (right); those 
for flow-conneeted pairs are blue crosses, and those for flow-unconneeted pairs are black 
circles. 

positions exhibit a strong functional relationship, shown in the right panel 
of Figure 2, which is exploited in the dependence model described in Sec¬ 
tion 3.3; 

• the crosses in Figure 2 represent the extremal coefficients of pairs of ffow- 
connected stations, which have one station located upstream of the other. 
Such pairs are generally more dependent than flow-unconnected pairs, 
not only because the catchments are close but also owing to the flow of 
water along the river. In Section 3.2 we explain how knowledge about the 
network structure and river sizes can be included in the dependence model 
for flooding using ideas of Ver Hoef and Peterson (2010), who defined 
covariance functions on river networks. 

As one application of such a model, we would like to be able to compute 
the multivariate counterpart of (2), that is, the probability of a rare event 
such as 

P{X(si) >ui,...,X{sk) >Uk} 

for large ui,... ,Uk >0, where si,..., G T can be any stations on the river 
network, even without measurements there. More complicated quantities, 
such as the sum of discharges at several stations, may also be of interest. 

2. Preliminaries. 

2.1. Extreme value theory. The only nontrivial limiting distribution for 
the normalized maxima of an independent and identically distributed se¬ 
quence of scalar random variables is the max-stable GEVD, expression (1). 
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In the multivariate case, we can transform each margin such that the max- 
limit has a standard Frechet cumulative distribution function exp(—1/x) 
(a; > 0). In this way, without loss of generality, we can concentrate on the 
multivariate dependence between the components [Resnick (1987), Proposi¬ 
tion 5.8]. 

Let Xj = {Xi^i ,..., (i = 1,..., n) be independent copies of an m- 

variate random vector X and assume that for each j = 1,..., m the maximum 
maxjXj^j converges to a GEVD Gj, as in (1), with norming constants bj^n £ 
1^) 0'j,n > 0 and shape parameter Define the transformations 

(3) Uj{x) = -l/logGj{x) = + 

and note that 

lim max Uj(^^ —< xl = exp(—1/x), j = l,...,m. 

n^oo y aj^n J J 

We say that X is in the multivariate maximum domain of attraction 
(MDA) of a random vector Z = (Zi, ..., Zm), if for any z = (zi,..., Zm), 

lim P 

n^oo 

( 4 ) 

call this joint distribution Fz{z). In this case, Z is max-stable with standard 
Frechet marginal distributions; see before (9). Moreover, by Resnick (1987), 
Proposition 5.8, we may write 


max Ui 


Xi^i — bi^ 
^l.n 


< ,..., max Um 






P(Z<z); 


(5) Fz(z) =exp{-I/(z)}, zGM™, 


where the exponent measure R is a measure defined on the cone E = 
[ 0 , 00 )™ \{0} and R(z) is shorthand for R([0,z]^). The object V incorpo¬ 
rates the extremal dependence structure of Z, where R(z) = 1/min(zi,..., 
Zm) and R(z) = 1/21 H-h 1/zm represent complete dependence and inde¬ 

pendence, respectively. The measure V is homogeneous of order —1, that is, 
V (Az) = (z), for A > 0, and it satishes V {z, 00 ,..., 00 ) = l/z for z > 0 

and any permutation of its arguments. There are many parametric models 
for the exponent measure V and thus for multivariate extreme value distri¬ 
butions or copulas. The explosion of parameters in most such models makes 
fitting them feasible only in low dimensions. 

By Proposition 5.17 of Resnick (1987) the convergence in (4) is equivalent 
to 


(6) lim nP 

n^oo 




Xi - 61,, 

0-1,n 




Xm, bn 


eA 


V(A) 
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for any Borel subset A <Z E which is bounded away from 0 and satisfies 
V{dA) =0, where dA is the boundary of A. This important observation 
allows us to approximate the probability that X falls into a rare region. For 
instance, \i A = (ui, oo) x • • • x {um, oo) (ui,..., Um G IK), then for large n 
(6) implies that 


(7) 


'^j ^j,n 




, OO 


where H denotes the Cartesian product. More complicated events such as 
A = {x G : X)™ 1 Xi > u} for some u G M can also be considered. Equation 
(6) implies that as n —)• oo the empirical point process 


U^ 


XlA - I 


U,n 


, . . . , U„ 


Xm,i ^m,n 


: i = 1,..., n 


®l,n / \ '^m,n 

converges vaguely to a Poisson point process on E with intensity measure 
V [Resnick (1987), Proposition 3.21]. In Section 4 this result will be used 
to derive the asymptotic distribution of exceedances and to fit parametric 
models for V. 

In the bivariate case m = 2, a common summary statistic for the depen¬ 
dence among components of Fz is the extremal coefficient 0 G [1,2] [see, e.g., 
Schlather and Tawn (2003)], which is defined through the expression 

(8) ¥{Zi <u, Z 2 <u) = F{Zi <u)^, u>0, 

or, equivalently, 6 = V{\,1)- Consequently, the cases 9 = 1 and 9 = 2 corre¬ 
spond to complete dependence and independence. Model-free estimation of 
the extremal coefficient is possible through the madogram [Cooley, Naveau 
and Poncet (2006)], and these estimates of 9 can be used for model-checking. 

2.2. Max-stable processes. Max-stable processes can be defined on any 
index set T, though this is usually taken to be a subset of an Euclidean 
space A random process {Z{t) ; t G T} is called max-stable if there exists 
a sequence (Xj)jgi!^ of independent copies of a process {X{t) : t G T} and 
functions On(t) > 0, bn{t) G M, such that the convergence 


(9) 


Z{t)= lim I max Xi(t) - bn{t)\/an{t), 


tGT, 


holds in the sense of finite dimensional distributions. In this case, the process 
X is said to lie in the max-domain of attraction of Z. 

The class of max-stable processes is generally too large for statistical 
modeling, so one typically considers parametric subclasses of models. Ex¬ 
amples include mixed moving maxima processes [Wang and Stoev (2010)], 
Schlather processes [Schlather (2002)] and Brown-Resnick processes [Brown 





EXTREMES ON RIVER NETWORKS 


7 


and Resnick (1977), Kabluchko, Schlather and de Haan (2009)]. In this pa¬ 
per we rely on the construction principle for a large class of max-stable pro¬ 
cesses given in Kabluchko (2011); see also Kabluchko, Schlather and de Haan 
(2009). A negative definite kernel V on an arbitrary nonempty set T is a 
mapping F : T x T —)• [0, oo) such that for any n G N and oi,..., G M with 
Sr=i ~ have 

n n 

< 0, ti,...,tn&T. 

i=i j=i 

The following result states that there corresponds a max-stable process to 
any negative definite kernel on T. 

Theorem 2.1 [Kabluchko (2011), Theorem 1]. Suppose that Wi (i&N) 
are independent copies of the zero-mean Gaussian process {W{t) : t G T} 
whose incremental variance E{lK(s) — W{t)}‘^ equals T{s,t) for all s,t gT. 
Let (T^(t) =E{iy(t)^} denote the variance function ofW and let {Ui : i G N} 
denote a Poisson process on (0,oo) with intensity u~‘^du. Then the process 

(10) r/r(t) = maxt/iexp{Wj(t) — (T^(t)/2}, t G T, 

ieN 

is max-stable, has standard Frechet margins, and its distribution depends 
only on F. 

If T = and W is an intrinsically stationary Gaussian process, then r/r 
is called a Brown-Resnick process [Brown and Resnick (1977), Kabluchko, 
Schlather and de Haan (2009)]. This is a popular model for complex ex¬ 
treme events. The generation of random samples from Brown-Resnick type 
processes is challenging [cf. Engelke, Kabluchko and Schlather (2011), Oest- 
ing, Kabluchko and Schlather (2012)], but recent advances provide exact 
and efficient algorithms [Dieker and Mikosch (2015), Dombry, Engelke and 
Oesting (2016)]. 

Remark 2.2. (a) For any negative definite kernel F there are many dif¬ 

ferent Gaussian processes with incremental variance F [Kabluchko (2011), 
Remark 1]. In particular, for u G T, we can choose a unique Gaussian pro¬ 
cess with incremental variance F and (u) = 0 almost surely. The 

covariance function of this process is 

(11) E{lF(“)(t)H/(“)(s)} = {T{s,u)+T{t,u) - T{s,t)}/2. 

Thus, there is a one-to-one correspondence between negative definite kernels 
F and the class of max-stable processes rjr- 

(b) If {A"(t) : t G r} is a zero-mean Gaussian process with covariance 
function (7 : T x T —>• M, then F(s, t) = C{s, s)-\-C{t, t) — 2C{s,t) is a negative 
dehnite kernel on T. 
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The bivariate distribution function of {r]r{s),rjr{t)) {s,t G T) is 
F{r/r(s) <x,r]rit)<y} 

log(x^] 

2 x/f(M)J/’ 

x,y>0, 


(12) = exp<-<!' 

X 


vr(s,t) log(y/x) 




_ 

y 


where is the standard normal distribution function. Analogously to the 
extremal coefficient in (8), one considers the extremal coefficient function 
0{s,t) {s,t G T), defined as the extremal coefficient of the bivariate vector 
{rjr{s),r]rii)), as a measure of the functional extremal dependence of the 
max-stable process yr- By (12), we conclude that 

(13) 0(,,i) = 2cl>|v5M|, 


so the negative definite kernel T parameterizes the extremal dependence 
between observations at positions s and t; small and large values of r(s,t) 
correspond to strong and weak dependence, respectively. By Remark 2.2(a), 
any kernel T yields a max-stable process r/r, so in Section 3 we can and will 
focus on Ending a parametric model for T suitable for our application. 

The higher dimensional distributions of r/r are more complicated. For 
instance, for t = (ti,..., tm) G T™', the random vector {rjriti), ■ ■ ■, Vritm)) is 
max-stable and its exponent measure lA,t defined in (5) is characterized by 
[Kabluchko (2011)] 


(14) 


Vr,t{xi,...,Xm) =E 


max 


W{ti) - a^{ti)l2 


Xi 


This multivariate max-stable distribution is called the Hiisler-Reiss distri¬ 
bution [Hiisler and Reiss (1989)]. Computation of the expected value in (14) 
involves high-dimensional integrals and thus is awkward in general. 


3. Model. 


3.1. River network. In the previous section we showed how to define 
max-stable processes on an arbitrary index set T. From here on, T will 
represent a river network and we will construct a kernel F flexible enough 
to explain the extremal dependence observed in data. 

Let us first hx some notation for river networks [Ver Hoef and Peterson 
(2010)]. We embed our network T in the Euclidean space representing 
the geographical river basin. To this end, let T C denote the collection 
of piecewise differentiable curves, called river segments, that are connected 
at the junctions of the river and whose union constitutes the river network. 
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River Distance 


Fig. 3. River network with three locations ti,t 2 ,ts £ T; t\ is flow-connected with both t 2 
and tz, but t 2 and tz are flow-unconnected. 

There is a finite number M S N of such segments and we index them by 
i ^ S = {1,M}. The network is dendritic, in the sense that there is one 
most downstream segment, which splits up into other segments when going 
upstream; see Figure 3. For a location G T on the zth segment, we let 
C 5 denote the index set of river segments downstream of ti, including 
the fth segment. Moreover, for another location tj G T on the jth segment we 
say that ti and tj are flow-connected, written GG tj, if and only if Di C Dj 
or Dj ^ Di. If ti and tj are not flow-connected, we say that they are flow- 
uneonneeted and write ti ^ tj. If tj is upstream of ti, that is, Di C Dj, then 
we denote the set of segments between tj and ti, inclusive of the jth but 
exclusive of the ith segment, by Bij = Dj \Di. If tj is downstream of ti, 
then Bij = Di\ Dj. In the case that ti and tj are on the same segment, that 
is, Di = Dj, we put Bij = 0. 

We define the river distanee d{ti,t 2 ) between two arbitrary points ti,t 2 
on the network T as the shortest distance along T, that is, we sum the 
arc-lengths of the segment curves lying between ti and t 2 ; see Figure 3. 
The embedding of the river network T in the Euclidean space has the 
advantage that we can exploit the geographical structure of the river basin. 
To this end, associate to each location t = (x,y) G T C the set St C 
of all points on the geographical map such that water from this point will 
eventually flow through point t on the river. The set St is called the sub¬ 
catchment of location t; see Figure 4. 

As explained in Section 2.2, we need to construct a negative definite ker¬ 
nel F on the space T xT that captures the dependence structure of extreme 
values on the river network T. Figure 2 suggests that this should be based 
on two components: one, Friv, for the flow-connected dependence along the 
river, taking into account the hydrological properties of the river network; 
and another, Feuo for the dependence resulting from the geographical struc¬ 
ture of the river basin and spatially distributed meteorological variables. 

3.2. Dependence measure Friv There are many models for Gaussian 
random fields where the covariance between two locations depends only on 
the Euclidean distance between two points. Such covariances are not valid 
with metrics such as the river distance d on our network because they may 
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Fig. 4. Gauging stations 5 and 23 (black circles), their sub-catchments in light green and 
blue, respectively, and their hydrological locations (black triangles) as defined in (17). 

not be positive definite. Recent work [Ver Hoef, Peterson and Theobald 
(2006), Cressie et al. (2006), Ver Hoef and Peterson (2010)] has developed 
covariances that are positive definite as functions of river distance. A re¬ 
lated approach, the top-kriging of Skpien, Merz and Bloschl (2006), uses 
variograms integrated over catchments, but does not provide closed-form 
formulae, so we focus on river distance methods. 

Following the “upstream construction” in Ver Hoef, Peterson and Theobald 
(2006), we can define a covariance function based on river distance for 
ti,tj G T by 

keBij 

0 , 

where the covariance function Ci arises from a moving average construction 
on M. If Bij = 0 in (15), then OfceBi ■ is set to 1. The corresponding 
weights vTfc (k G Bij) are chosen such that the variance is constant, that is, 
CRiv{ti,ti) = C^n{tj,tj) = (71(0) for all ti,tj G T. For a fuller treatment, see 
Ver Hoef, Peterson and Theobald (2006) and Ver Hoef and Peterson (2010), 
who also provide different parametric classes for the covariance function Ci , 
including the linear with sill model 

C'i(/i) = (1 -/i/r)+, T>0, 

which we use below. Intuitively, the covariance function (15) can be under¬ 
stood as follows: an event at a downstream location, for example, H in Fig¬ 
ure 3, can be caused by an event on one of the two branches of an upstream 
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bifurcation. The weights vr^ quantify the proportions of events coming from 
the branches. If several bifurcations lie between two flow-connected loca¬ 
tions, then the weights along the connection must be multiplied. The choice 
of the weights in the covariance function Criv in (15) is crucial and depends 
on the application. As we consider extreme discharges on river networks, 
the weights at a bifurcation should reflect the proportion of large discharge 
values at the downstream river that are caused by a large discharge of one 
of the upstream rivers. In Figure 3, for example, a natural choice for the 
weights 7r2, vra on the river segments of t 2 , is to take the proportion of 
mean water volumes, that is, vTj = Et^/{Et 2 + Et^), where Et^ is the aver¬ 
age discharge at location ti {i = 2,3). This, however, requires measurements 
at all bifurcations. Since we would like to use our model for extrapolation 
to parts of the network without measurements, we must approximate Et-^ 
and Et 2 - A digital elevation model can be used to extract the geographical 
coordinates of the sub-catchment St corresponding to each location t gT 
on the river network, including the altitude h{x,y) at all (x,y) G St- Ex¬ 
ploratory analysis shows that altitude is an excellent covariate for average 
precipitation, so we define E* as the integrated altitude over St, that is, 



h{x, y) dx dy, 


which is thus approximately proportional to the average runoff accumulated 
in the sub-catchment St- We then dehne the weights in the above example 
to be 


(16) ^, = El/{El^+El), z = 2,3. 

By the second part of Remark 2.2 and the construction of the positive def¬ 
inite covariance function in (15), we obtain a negative definite kernel Triv 
on the river network T by setting 


rRiv(ti) tjr 


keBij 

1, tt t j . 


3.3. Dependence measure Truc- Two flow-unconnected locations on the 
river network can have dependent extreme discharges, since precipitation 
is spatially dependent. As shown in Figure 2, the usual Euclidean distance 
between two points cannot fully explain this dependence, because the to¬ 
tal amount of water at location t gT on the river network comes not only 
from precipitation there, but also from the accumulated runoff from its sub¬ 
catchment St- Thus, instead of the Euclidean distance between two points 
s,t gT, we should consider a hydrological distance that appropriately de¬ 
scribes the distance between runoff in sub-catchments Ss and St due to 
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precipitation. For this purpose we first shift each location t € T to a hydro- 
logical location by a function : T —)• . In our case, the center of mass 
of mean annual precipitation on the sub-catchment St gives a good choice 
[Merz and Bloschl (2005)]. As noted in Section 3.2, precipitation data on 
a dense grid is often difficult to obtain, so we use the altitude h{x,y) at 
location (x, y) G St instead. 

The hydrological location H{t), or “altitude weighted centroid,” of a point 
on the river network is 

(17) H{t) = xh{x,y)dxdy,^ yh{x,y)dxdy'^ , t G T, 

and the hydrological distance between s,t gT is ||i?(s) — H{t)\\, where || • || 
denotes Euclidean distance. Figure 4 shows two stations on the river net¬ 
work that are close in terms of Euclidean distance but whose hydrological 
locations are far apart. The right-hand panel of Figure 2 reveals strong func¬ 
tional dependence of the extremal coefficients on hydrological distance. 

A variogram that is valid on the Euclidean space can be applied 
to the hydrological positions H(t) (t G T). The fractal variogram family 
^a{x,y) = ||x — y||“ {x,y G M^), where a G (0,2] is called the shape param¬ 
eter, is commonly used, but it is isotropic: the dependence decreases at 
the same rate in each direction. Extremal meteorological data often exhibit 
anisotropies that can be captured by including a rotation and dilation matrix 
[Blanchet and Davison (2011), Engelke et al. (2015)] 

(18) R = R{p,c)=(^^^^ /3g [7r/4,37r/4],c> 0, 

\csmp ccosp / 

where the restriction of (5 to one quadrant ensures the identifiability of the 
parameters (/3,c). Applying the kernel Fq, and transformation R to the po¬ 
sitions H{t), we obtain a negative definite kernel on the river network T, 
that is, 

rEuc(ii,ii) = \\R-H{ti) - i? • f/'(tj)||“, ti,tj gT, 
where R ■ v denotes matrix multiplication of R and the vector u G . 

3.4. Max-stable process on T. In Sections 3.2 and 3.3 we defined two 
negative definite kernels on the river network T: Friv models the extremal 
dependence of flow-connected stations due to the specific hydrological prop¬ 
erties of the river network, and Feuc describes additional dependence be¬ 
tween all stations due to the geographical structure of the river basin and 
spatially distributed precipitation. We combine these to obtain our hnal 
dependence model: given weights Ariv) Aeuc > 0, we put 

F(tj,tj) — ARivFRiv(tj, tj) -|- AeucEeuc( fiTj) 
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( 19 ) 

ARivIl- + 

_ ^ ^ keBij 

+ AeucII-R • H{ti) — R ■ H{tj)\\°‘, ti -H- tj, 

. Ariv + AeucII-R • H{ti) - R ■ ti tj, 

for any ti,tj G T. By Remark 2.2 we can define a Gaussian random field W 
on T with variogram F, and by Theorem 2.1 we obtain a max-stable process 
ijr on T, defined in (10), with dependence function F. The process r/r is 
nonstationary: indeed, since it is not defined on a Euclidean space, even the 
notion of stationarity is unclear. 

The process i]r has standard Frechet margins. However, even after nor¬ 
malization of the data with scale and location parameters at each location 
t G T as in (1), the univariate tail distributions will have different shapes. We 
must therefore transform the standard Frechet margins in (10) to GEVD. 
We set 


( 20 ) 


Vr{t) 


rir{t)^d) — 1 

W) 


teT, 


where ^{t) G M is the shape parameter at point t G T. It is then easily verified 
that the margins of follow a GEVD, that is, 

P{r7r(i) <x} = exp[—{1 -|- x G M. 


4. Inference. 


4.1. General. Inference for the extremes of univariate data is well devel¬ 
oped [Goles (2001), de Haan and Ferreira (2006), Embrechts, Kliippelberg 
and Mikosch (1997)], so we merely sketch it in Section 4.2. Statistical infer¬ 
ence for multivariate or spatial models is more difficult, as their distributions 
are rarely known in closed form or involve high-dimensional integration. 
Gomposite likelihood methods based on bivariate densities have therefore 
been widely applied [Padoan, Ribatet and Sisson (2010), Davison and Gho- 
lamrezaee (2012), Huser and Davison (2014)]. Recent research has focused 
on methods that exploit full likelihoods of multivariate extreme observations 
through peaks-over-threshold approaches [Wadsworth and Tawn (2014), En- 
gelke et al. (2015), Thibaud and Opitz (2015), Bienveniie and Robert (2014)] 
and on M-estimators for spatial extremes [Einmahl et al. (2015)]. However, 
different definitions of an extreme event yield different inferences. One might 
call a multivariate observation extreme if at least one component is large, 
leading to multivariate generalized Pareto distributions [Rootzen and Taj- 
vidi (2006)], whereas choosing data where a single fixed component exceeds a 
high threshold gives a conditional extreme value model [Heffernan and Tawn 




14 


P. ASADI, A. C. DAVISON AND S. ENGELKE 


(2004)], and spectral estimation is based on observations where a suitable 
norm of the components is large [cf. Coles and Tawn (1991)]. For hnite 
samples each choice has advantages and disadvantages [Huser, Davison and 
Genton (2014)]. 

We consider two estimation procedures tailor-made for a max-stable pro¬ 
cess 77r whose finite-dimensional margins follow the Hiisler-Reiss distribu¬ 
tion (14). Engelke et al. (2015) compute the spectral density of the exponent 
measure (14) and introduce an estimator for the parameters of a Brown- 
Resnick process [Kabluchko, Schlather and de Haan (2009)]. Wadsworth 
and Tawn (2014) use events for which at least one component exceeds a 
high threshold, and censor any components that stay below it. 

In Section 4.3 we review these two methods, show how they can be 
adapted to our framework, and derive a new representation of the condi¬ 
tional densities, simpler than that in Wadsworth and Tawn (2014). Asadi, 
Davison and Engelke (2015) describe a small simulation study that aids in 
the choice of estimator for our application. 

4.2. Univariate margins. We must estimate the univariate extreme value 
parameters, that is, the norming constants bj,n-, and the shape param¬ 
eter (j = 1,... ,m) in (1). This allows the calculation of univariate return 
levels at each location and is needed for the transformations Uj^n in (3) that 
appear in the multivariate exceedance probabilities (7). We use the Poisson 
point process approach [Coles (2001), Section 7.3] to ht these models for the 
univariate exceedances. 

Recall that X* = ..., Xra,i) (i = 1,..., n) are independent copies of 

an m-variate random vector X as in Section 2.1. Eor each location j = 
1,..., m, let Qj^p be the empirical p-quantile, with p « 1, of the data Xj^i ,..., 
Xj^n, and write Ij = {z G {1,..., n} : Xj^i > qj^p}. Then the Poisson point 
process likelihood for the exceedances at station tj, assumed independent, 
can be written as [Coles (2001), (7.9)] 



( 21 ) 



where nj is the number of years of observations at location tj. Owing to the 
inclusion of nj, the parameters aj^^bj^n and equal those in the GEVD 
(1) for yearly maxima. A joint model for the parameters at different loca¬ 
tions, such as a linear model with environmental covariates, can be fitted by 
maximizing a so-called independence likelihood [Chandler and Bate (2007)] 
based on the product of (21) over all stations. 
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4.3. Estimation ofijr. In order to fit the max-stable process r/r intro¬ 
duced in Section 3 with dependence kernel (19), we must estimate the six 
parameters 

-^Riv ^ 0) -^Euc ^0) T > 0, 

(22) 

aG(0,2], /3 G [7r/4,37r/4], c > 0, 

that characterize the river and Euclidean dependence functions E^iv and 
Eeuc and their weights. Below we write i? = (Ariv, Aeuc, t, a, f5, c), and denote 
the corresponding parameter space by 0. When stressing that E depends on 
the parameter iH, we write E = E^. 

We do not observe data from the asymptotic limit model ryp itself, so let 
us specify the assumptions for our observations. As in Section 3, let T denote 
the river network and assume that we have n observations Xi,..., X„ G M™' 
at m locations t = (ti,..., tm) G T™. Further, suppose that the data are nor¬ 
malized to standard Pareto margins with cumulative distribution function 
1 — 1/x (x > 1) and that the vectors X^ (A: = 1,..., n) are independent copies 
of a random vector X in the max-domain of attraction of the max-stable 
process ryr(t) = (^r(Ai); • • • ;??r(Am))- This means that 

(23) lim nP(X/nG A) = Ert(^), 

n—>-oo ’ 

for any Borel subset A<Z E which is bounded away from 0 and which has 
zero Ep^t measure on its boundary; recall the dehnition of the exponent 
measure in Section 2.1. 

4.3.1. Spectral estimation o/E^. The random vector r/r(t) follows a mul¬ 
tivariate Hiisler-Reiss distribution. Even though its multivariate densities 
are not available, the densities of its exponent measure IE,t have closed 
forms for any dimensions and we can apply the spectral estimator proposed 
by Engelke et al. (2015). Indeed, for large thresholds m > 0 the convergence 
in (23) justifies the approximation 

dm 

(24) P(X G dx, ||X||i -^^Er,t(a:i,---,Tm)dx, 

where ||x||i = ^ -®) denotes the Li-norm, and Er,t({x G E : 

||x||i > 1}) = m. Owing to the homogeneity of the exponent measure Vp^t 
in Section 2.1, it suffices to specify the angular part of (24), namely, its 
spectral density on the positive Lp-sphere Sm-i = {x > 0 : ||x||i = 1} C 
[Coles and Tawn (1991)]. Engelke et al. (2015) showed that the spectral 
density of the Hiisler-Reiss exponent measure is 

G Sm-l, 
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where cj = (log(wj/a;i) + rij(ij,ii)/2 : j = 2,... and C i)x(m i) 
is the covariance matrix from Remark 2.2(a) for u = ti, that is, 

(25) 

Thus, denoting the index set of extremal observations by X = {A: = 1,..., re : 
||Xfc 111 > re}, the spectral estimator ??spec of ^ is dehned by 

(26) i?sPEC = argmaxy^log5,j(Xfc/||Xfc||i). 

kei 

The advantage of this estimator over composite likelihood counterparts is 
that it uses a full likelihood and thus is fully efficient, thus giving improved 
estimation of Brown-Resnick processes; see the simulation study in Engelke 
et al. (2015). Owing to the explicit form of the spectral densities, this ap¬ 
proach is feasible even for a large number rre of locations. 

4.3.2. Censored estimation o/T^. Conditioning on the norm of obser¬ 
vations being large, as in (24), might introduce bias, since the limit distri¬ 
bution may provide a poor density approximation to any of the X^ that 
have small individual components. To overcome this, Wadsworth and Tawn 
(2014) apply censoring to those components that do not exceed a fixed high 
threshold. We adopt their approach, giving a new, simpler expression for 
the censored likelihood, valid for any process with Hiisler-Reiss margins, 
not just for stationary Brown-Resnick processes. 

Similarly to the spectral estimation based on (24), for large thresholds 
re > 0 we have the approximation 

/ \ 

(27) P Xedx, max Xj>u)rii-- --— Vr,t{xi, ■ ■ ■ ,Xm) dx. 

\ j — 

Here, a multivariate observation is said to be extreme if at least one compo¬ 
nent exceeds the threshold. For the likelihood contribution from an obser¬ 
vation X = (Xi,..., Xm) we distinguish two cases: 

• if at least one component exceeds the threshold, that is, Xj > re for all 
j G JC and Xj < re for all j G = {1,... , rre} \ /C for a nonempty sub¬ 
set X C {l,...,rre}, we compute the likelihood /,j^^(X) by censoring all 
/C^-components of the full likelihood /i?,i:m(X)- We thus only use the 
information that those components are below the threshold re, but not 
their exact values. Without loss of generality, let /C = {1,..., 6}, for some 
6 G {1,..., rre}. Then the censored likelihood is 

Qb 

/i?,Ac(x) = _ _g^^ Rr,t(a:i,... ,Xfe,re,..., re) 

1 

xfx2 ■ --Xb 


(28) 


4>b-l{^2:b] ^2-.b,2:b)^m-b{fJ'C', ^c), 
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where S = S,? is the covariance matrix in (25), x = (logXj — logxi + 
r,^(tj,ti)/2 ■. j = G M”*, and and denote the 

density and the cumulative distribution function of a p-dimensional, zero- 
mean normal distribution with covariance matrix 'h. We set (/)o to 1 if 
6 = 1, and $o to 1 if 6 = m. The conditional mean ^md covariance 
matrix Sc are 

(29) HC = (logu - logXl + r^(ti,tl)/2)j=;,+ c 

(30) Sc ^(fe+l):m,(fe+l):m ^(fe+l):m,2:fe^2:6,2:fe^2:&,(fe+l):m' 

In the case 6=1, and Sc are unconditional, that is, the last summands 
in the formulas above vanish. The derivation of this new representation 
of /i?,a: can be found in Asadi, Davison and Engelke (2015). 

• if none of the components exceeds u, that is, JC = 0, then the likelihood 
contribution is just the probability f^^jci'x.) = 1 — I/r,t(u) that X lies en¬ 
tirely below the threshold. 

Let = {i = 1,... ,n : maxfc=i^...^m> u} denote the index set of ob¬ 
servations extreme in the sense of (27) and, for each i G J', let /Cj be the 
index set of those components of Xj that exceed u. Then, the censored es¬ 
timator i9cens is obtained by maximizing the log-likelihood [Thibaud and 
Opitz (2015), Section 3] 

(re - |J|)log{l - Er,t(u)} -F^log/,?,K;i(Xi) . 

iGj 

This estimator has the advantage of using full likelihoods and reducing po¬ 
tential bias by censoring components that might not yet have converged, 
but the disadvantage of being slow when rre is large, since the censored like¬ 
lihood then involves the burdensome evaluation of high-dimensional 
normal distribution functions. 

4.3.3. Simulation study. The two estimators iIspec and ??cens use dif¬ 
ferent data and will have different behavior for finite sample sizes. We con¬ 
ducted a small simulation study to assess their performance in a setting 
similar to our application. Details can be found in Asadi, Davison and En¬ 
gelke (2015). Both estimation procedures work for the simulated data, even 
with a low number of observations; only the extreme events contribute to 
the likelihoods. In simulated data, the advantage of censoring cannot be 
seen, but it will reduce any bias for real data. As also noted by Engelke 
et al. (2015) and Einmahl et al. (2015), the estimates of Aeuc have larger 
variation than the others. In fact, owing to a near-functional relationship 
between the scale Aeuc and the shape a of the fractal variogram, these two 
parameters are strongly related in the range considered here, and this near 
lack of identihability gives highly variable estimators of Aeuc- 


(31) ilcENS = argmax 
i?Ge 
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5. Extreme river discharges in the upper Danube basin. 

5.1. Data. We used data for average daily discharges recorded at m = 31 
German gauging stations on 20 rivers in the upper Danube basin, made avail¬ 
able by the Bavarian Environmental Agency (http: //www. gkd. bayern. de). 
The average discharges at these stations range from around 20 m^/s at high 
altitudes to around 1400m^/s at the most downstream station. The major 
part of the runoff in the basin arises from the Alps, situated south of the 
Danube; see Figure 1. The series at individual stations have lengths from 50 
to 130 years, with 50 years of data for all stations from 1960-2009. Origi¬ 
nally, data were provided for 47 stations, but we excluded 16 stations which 
have very small discharges or whose largest discharges are affected by hy¬ 
droelectric installations or dampened by big lakes; it might be possible to 
include these data by applying special preprocessing techniques, but we have 
not explored this. 

Exploratory analysis shows that around one-half of the annual maxima 
in the basin occur in June, July and August. This agrees with the study of 
floods in the Danube tributaries Lech and Isar by Bohm and Wetzel (2006), 
which shows that nearly all major floods in recent decades have occurred 
in these three months; floods in this area are typically caused by heavy 
summer rain. In order to eliminate temporal nonstationarities and the effect 
of snow melt, we restrict our analysis to these months. For k = 1,..., N , we 
let Yfc = (Yi^fc,... ,Ym,k) denote the daily mean discharge at the m stations 
on day k. The number of common measurements at all stations is thus 
Y = 50 X 92 = 4600, that is, 50 years of 92 daily observations in the summer 
months. 

Seasonality and overall trend are the main sources of nonstationarity in 
river flow data, but as we use only the summer month discharges, the sea¬ 
sonality becomes negligible. National studies have concluded that there are 
no significant trends in the extremes of stream flows in our area of interest 
[Katz, Parlange and Naveau (2002), Kundzewicz et al. (2005)], in agreement 
with our exploratory analysis, so henceforth we treat our data as temporally 
stationary. 

In addition to the time series of daily average discharges, we use a digital 
elevation model to obtain the following geographical covariates at each sta¬ 
tion: the latitude and longitude of both the station itself and the weighted 
centroid of its sub-catchment, and catchment attributes including its size, 
mean altitude and mean slope. 

5.2. Declustering. Extreme discharges at a given station occur in clusters 
due to temporal dependence, which must be removed for spatial modeling. 
Moreover, a large value at an upstream station may cause a peak further 
downstream a day or two later. These slightly shifted maximum values on 
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Fig. 5. Declustered flood events at four gauging stations. The grey hatched areas are the 
p-day time windows around flood events. Only events for which at least one river exceeds 
its 90% quantile (dotted horizontal lines) are shown. The black circles show maxima for 
each river in each window. 

different rivers stem from the same event and should be treated as depen¬ 
dent. In the framework of meteorology, multivariate declustering is used by 
Tawn (1988), Coles and Tawn (1991) and Palutikof et al. (1999) to extract 
independent “storm events.” We apply a similar technique to obtain a set 
of independent flood events Xi,..., X„ G on the river network from the 
full time series (A; = 1,..., N). 

In order to extract the flood events, we first identify nonoverlapping win¬ 
dows of length p days in each of the 50 summer periods. We replace each 
observation by its rank within its series, and then consider the day with 
the highest rank across all series, choosing this day randomly if it is not 
unique. We then take a window of p days centered upon the chosen day, 
and form an event by taking the largest observation for each series within 
this window. We delete the data in this window and then repeat the process 
of forming events, stopping when no windows of p consecutive days remain. 
Figure 5 illustrates this declustering procedure. In agreement with Kallache 
et al. (2010), our data suggest that flood events last no longer than 9 days, 
so we put p = 9; a sensitivity analysis showed that our results are robust to 
this choice. For the fth time window, the corresponding flood event X* is the 
m-dimensional vector whose jth entry is the maximum discharge value at lo¬ 
cation tj within this window. This procedure yields a declustered time series 
of n = 428 supposedly independent events Xj from the N = 4600 summer 
measurements common to the 31 series. 

5.3. Marginal fitting. Before using the techniques from Section 4.3 to fit 
the multivariate dependence model, we assess the univariate tail behavior 
at individual gauging stations, obtaining the constants aj^n,bj,n and shape 
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parameters that allow us to normalize the margins to lie in the standard 
Frechet max-domain of attraction, using (3). The model fj-p in (20) is a 
max-stable stochastic process on the whole river network T, so in order to 
make predictions throughout T, we must allow the norming constants and 
shape parameters to vary with covariates that are easily obtainable even 
at locations without gauging stations or find some other way to extend the 
model to the entire network, such as kriging. 

We fitted a generalized extreme value distribution (2) to the tail of the 
declustered daily discharges at each gauging station location tj, estimating 
the extreme value parameters aj^n, bj^n and ^j. At each location we tested 
whether the extremal behavior from any available earlier data changed rel¬ 
ative to the 50 common years. In almost all cases there was no such change, 
and we could use the longer series of independent events, declustered using 
the procedure of Section 5.2, for each station. For the marginal htting we 
use the independent events at gauging stations and estimate the GEV pa¬ 
rameters by maximizing the joint Poisson process likelihood given in (21) in 
an independence likelihood [Chandler and Bate (2007)]. 

We htted and compared a variety of different models using this technique, 
finally settling on a version of regional analysis, as widely used in hydro- 
logical applications. The idea is similar to the regionalization method of 
Merz and Bloschl (2005), who predict high quantiles of river flows using the 
catchment attributes of stations that are “hydrologically” close. Exploratory 
analysis suggests that for our purposes the upper Danube basin can be split 
into four disjoint regions: R1 contains eight stations in the southwest of the 
upper Danube basin and has mid-altitude sub-catchments; R2 comprises five 
stations in the Inn basin that are fed by precipitation in high-altitude alpine 
regions; R3 contains 13 stations in the center of the Danube basin that are 
fed by precipitation from regions with both high and low altitudes; and 
R4 contains hve stations with sources north of the Danube. With Ji, ..., J4 
denoting the index sets of stations in regions Ri,..., R^, we let for j € R 
(i = l,...,4) 

4 

log(aj,n) = log(Pj,fc), 

k=l 
4 

log(6j,n) = log(Pj-fc), 

k=l 

where Rjj, ■ ■ ■ ,Pj ,4 are the latitude of the centroid, the size, the mean alti¬ 
tude and the mean slope of the sub-catchment of gauging station j. Likeli¬ 
hood ratio statistics were used to further simplify the model, finally yielding 
a model with 28 parameters, compared to 93 = 3 x 31 parameters in the 
full model. Diagnostic plots indicate a very satisfactory ht of the simpler 


(32) 
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Fig. 6. 100-year return levels for river flow (mfl/s), extrapolated to the entire network 
T; the colors of the points indicate the return levels at the 31 numbered gauging stations. 

model, which is also strongly favored by the AIC. The estimated shape pa¬ 
rameters and their standard errors for the four regions are 0.030 (0.025), 
0.145 (0.034), 0.028 (0.022) and 0.294 (0.045), suggesting that catchments 
influenced by mountain regions tend to have heavier-tailed responses. 

This model allows the extrapolation of the marginal fit to ungauged lo¬ 
cations on the network T, thereby enabling computation of return levels 
throughout T; see Figure 6. More details are given in Asadi, Davison and 
Engelke (2015). 

5.4. Joint fitting. The generalized extreme value distributions constitute 
all possible limits for univariate maxima, but the dependence structure of 
multivariate extremes is infinite-dimensional, so we must first check that the 
extreme discharges at different stations on the river network are asymptoti¬ 
cally dependent; if not, max-stable processes would not be suitable models. 
Keef, Svensson and Tawn (2009) note that the spatial dependence of ex¬ 
treme river flows is much stronger than that of precipitation data, since the 
former averages the latter and thus is less vulnerable to small-scale vari¬ 
ation, and standard diagnostics [Coles, Heffernan and Tawn (1999)] show 
strong extremal dependence between all 31 stations in our data. Moreover, 
Figure 7 shows bivariate scatter plots of two flow-connected and two flow- 
unconnected stations. In both cases, the assumption of asymptotic depen¬ 
dence seems appropriate and, moreover, a symmetric model for the tail de¬ 
pendence can be justified. 

The choice of a parametric subclass within the asymptotic dependence 
models must be a good approximation to the infinite-dimensional struc¬ 
ture of multivariate max-stable distributions. Theorem 17 in Kabluchko, 
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Eig. 7. Scatter plots of declustered discharges (normalized to the unit Frechet scale) of 
two flow-connected stations (left) and two flow-unconnected stations (right). 


Schlather and de Haan (2009) gives some justification for the fitting of 
Hiisler-Reiss distributions and Brown-Resnick type processes, which are 
essentially the only possible limits of pointwise maxima of suitably rescaled 
and normalized, independent, stationary Gaussian processes. 

In order to assess whether the Hiisler-Reiss distribution approximates the 
extremal dependence of our data well, we estimate the extremal coefficient 6 
as in (8) for each pair of locations using the madogram [Cooley, Naveau and 
Poncet (2006)] based on summer maxima. We then fit the bivariate Hiisler- 
Reiss distribution (12) to these data by a censored peaks-over-threshold 
approach and use (13) to compute a model-based extremal coefficient esti¬ 
mate 0 hr- The left panel of Figure 8 suggests that the Hiisler-Reiss model 



Eig. 8. Comparison of empirieal estimates of extremal eoeff dents found nonparametri- 
cally using the madogram and those implied by different models, for all pairs of gauging 
stations. Left: madogram-based estimates and extremal coefficients 6hr of the Hiisler-Reiss 
model, estimated by fitting to independent events. Center: estimates using Es plotted 
against hydrological distance. Right: madogram-based estimates and those from fitted joint 
model Fa. Those for flow-connected pairs are blue crosses, and those for flow-unconneeted 
pairs are black circles. 
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provides an excellent overall approximation to the bivariate extremal depen¬ 
dence structure of the discharge data, albeit with slight overestimation of 
dependence at longer distances for flow-unconnected pairs. 

We compare four overall models for the dependence kernel F: 

• the stationary variogram based on Euclidean distances with anisotropy 
matrix R as in (18), 

ri(s,f) = A||i? • (s — f)||“, A > 0,a e (0,2],/3 e [vr/d, 37 r/ 4 ],c > 0; 

• a variogram using the transformation H to hydrological locations, 

T2{s,t) = X\\R-{H{s)-H{t)}r, 

A > 0,a G (0,2],/3 G [tt/F, 37 r/ 4 ], c > 0; 

• a variogram that includes the hydrological properties of the river network 
for flow-connected locations, corresponding to (19), 

rsisR) = ARwrRiv(s,t) + AeucP • {Hit) - R'(s)}r, 

whose six parameters are given in ( 22 ); hnally, 

• we also consider the previous model without anisotropy, 

r4{s,t) = ARivrRiv(s,f) -I- AEuc||.ff(i) - -ff('S)ir, 

■^Riv) •^Euc ^ > 0,0 G (0,2]. 

The weights in FRiv are computed according to (16) using a digital elevation 
model. 

In Section 5.2 we extracted n = 428 independent multivariate flood events 
Xi,..., X„, whose univariate extremal behavior was analyzed in Section 5.3. 
In order to fit the multivariate dependence structure, we use the marginal 
empirical distribution functions to transform the distribution at each gaug¬ 
ing station to standard Pareto, and denote the resulting data by Xi,..., X„. 
We fit the functions Fi,..., r 4 for the negative definite kernel in ryr to these 
data using the inference procedures described in Section 4.3, first obtaining 
the spectral estimate iIspec in (26) by grid search on the parameter space 
0 , and then using this as an initial value for the more demanding computa¬ 
tion of the censored estimate 'ilcENS in (31). It would be preferable to fit the 
univariate margins and the dependence structure simultaneously, but here 
this is infeasible since the optimization for the dependence structure is very 
time intensive. 

The maximized log-likelihoods corresponding to Fi,... ,F 4 are —6629.17, 
—6161.86, —5907.49 and —5915.97; F 3 has six parameters, and the others all 
have four parameters. The use of hydrological distances for F 2 ,F 3 ,F 4 gives 
a huge improvement over the use of Euclidean distances in Fi, and adding 
the component Frr for flow-connected dependence means that F 3 is much 
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better than r 2 . The drop from Ta to r 4 shows that the anisotropy matrix 
R also contributes to the good fit of the model based on Ts. 

The center and right panels of Figure 8 (recall also the right panel of 
Figure 2) compare the extremal coefficients obtained with the madogram 
and those implied by the fitted model F 3 . The center panel shows that 
the latter do not lie on a smooth curve; flow-connected pairs at the same 
distance can have different extremal coefficients, depending on where the 
two stations lie on the network, because the river dependence kernel Fpiv is 
nonstationary, unlike those based on simple meteorology. Overall there is a 
fairly good fit, though the model tends to slightly understate dependence at 
short hydrological distances and to overstate it at long ones. 

The parameter estimates "dcENS are Ariv = 0.73 (0.07), Aeuc = 1-93 x 
10-‘^(0.75 X lO-"^), f = 839 (280) km, d = 1.75 (0.08), /3 = 1.10 (0.11) and 
c = 0.64 (0.08), with standard errors in parentheses obtained from 100 non- 
parametric bootstrap simulations. The high uncertainty for Aeuc was men¬ 
tioned when discussing the simulation study; it does not translate into high 
variation of the fitted model. 

The fitted weights Ariv and Aeuc cannot be compared directly, because 
the variogram Teuc is unbounded and thus does not have a natural normal¬ 
ization. The influences of the river and the Euclidean dependence kernel on 
the overall extremal dependence between two flow-connected points s,t£T 
can be measured by rRiv(s,t)/r 3 (s,t) and rEuc('S,t)/r 3 (s,t), respectively. 
In fact, for certain pairs of stations the river dependence kernel is dominant, 
whereas for others the Euclidean kernel has a stronger influence on the ex¬ 
tremal dependence. The parameter f is the scale for dependence along the 
river; as expected, this dependence is very strong, decreasing to zero only 
after f = 839 km. The shape parameter a describes how local the influ¬ 
ence of spatial meteorological events on river flows is; note that a = 1.75 is 
much larger than in applications on extreme precipitation, confirming the 
observation of Keef, Svensson and Tawn (2009) that extreme river flows 
exhibit stronger spatial dependence due to an averaging effect. The parame¬ 
ters /3 and c describe the anisotropy of meteorological dependence, since the 
transformation R{l3,c) dilates the space in direction (sin /3, cos /3) by c. As 
c < 1 , extremal dependence is increased in this direction, which corresponds 
approximately to the planar vector (2,1). Thus, in terms of hydrological 
distance, two stations that are 64 km apart in a direction roughly parallel to 
the Alps have the same dependence as two stations that are 100 km apart 
perpendicular to the Alps. In view of the orientations of the catchments and 
the blocking effect that the Alps have on weather systems, this seems quite 
plausible. 

5.5. Higher-order properties. Eigure 8 shows how the max-stable model 
r/Eg fits the bivariate extremal features of the data. In practice, higher-order 
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Fig. 9. QQ-plots (Gumbel scale) of observed groupwise yearly maxima and theoretical 
values from the fitted model, for groups of 3 (top left), 5 (top right), 15 (bottom left) and all 
31 (bottom right) stations. Dashed lines and dotted lines correspond to values for complete 
independence and complete dependence, respectively, and the solid line corresponds to the 
fitted model. 


properties such as multivariate exceedance probabilities are also of interest, 
and to check these we randomly choose groups of 3, 10, 15 and 31 sta¬ 
tions and compute the quantiles of their observed group maxima, suitably 
rescaled [cf. Davison and Gholamrezaee (2012)]. Figure 9, which compares 
these quantiles with the theoretical values derived from the fitted model, 
shows that the model captures even high order structures of the data very 
well. Moreover, the comparison of observed quantiles to those corresponding 
to complete independence and complete dependence underlines the impor¬ 
tance of proper dependence modeling. 

A joint extremal model allows the estimation of the risk of simultane¬ 
ous exceedances of high thresholds at multiple locations. More precisely, 
we can use equation (7) to approximate these probabilities as a function of 
the univariate extreme value parameters and the exponent measure V of 
the dependence model. For three stations t = {ti,t 2 ,ts) G T^, the exponent 
measure for our model is Fr,t as in (14). Let qj^p be the p-quantile of the 
distribution of daily discharges at station tj. The probability of a flood that 
exceeds the respective p-quantiles at all three stations in the same summer 
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can be approximated by 

K¥‘[X(tj) > qj,p',j = 1,2,3) 

(33) 


where K is the mean number of multivariate events per year. The estimates 
for the shape and scale parameters are taken from the htted covariate model 
in (32), so this multivariate exceedance probability, and others for more 
complex events, can be computed for any locations, even ungauged, on the 
river network. To compare the model with empirical data, we randomly 
choose 500 out of the (^ 3 ^) possible triplets of gauging stations and evaluate 
(33) for different values of p close to 1. The mean relative absolute differences 
of these model probabilities and their empirical counterparts are 15% for 
p = 0.95, 14% for p = 0.97, 19% for p = 0.99, and 31% for p = 0.995; the 
empirical counterparts are highly variable, since they are based on very few 
events. 

6. Discussion. The approach described above was used to fit other max- 
stable processes, such as the extremal-t or Schlather models, but we found 
that the Brown-Resnick model was the best of those fitted; perhaps this is 
not surprising, since this model is flexible and allows independent extremes 
at long distances, unlike the Schlather model, for example. 

Keef, Svensson and Tawn (2009), Keef, Tawn and Svensson (2009), Keef, 
Tawn and Lamb (2013) describe an alternative approach to modeling joint 
flooding that allows the possibility of asymptotically independent extremes 
through the fitting of the Heffernan and Tawn (2004) model. This can han¬ 
dle large-scale problems, but has the drawback of not treating the variables 
symmetrically, and it is not clear whether it corresponds to a well-defined 
joint model. In those papers, it is important to allow for asymptotic inde¬ 
pendence because the data arise from rivers that may be quite unrelated, 
whereas stronger dependence might be anticipated in a single river network, 
as in the present paper. Moreover, our approach uses the known structure 
of the river networks, which should provide better dependence modeling. 

Finally, the ideas suggested here might be extended to similar problems for 
which Euclidean geometry does not seem natural, such as the transmission 
of earthquake shocks along fault lines, or communication networks, though 
it would then be important to allow for flows in different directions. In some 
applications it might be useful to include the relative timings of extremes at 
different nodes of the network. 

Acknowledgments. We thank Jonathan Tawn, Hansjoerg Albrecher, Mar¬ 
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SUPPLEMENTARY MATERIAL 

Supplement to “Extremes on river networks” 

(DOI: 10.1214/15-AOAS863SUPP; .pdf). The supplementary material con¬ 
tains the following: a PDF document containing the derivation of the new 
likelihood representation mentioned in Section 4.3.2, results of the simula¬ 
tion study mentioned in Section 4.3.3, and additional details germane to 
Section 5.3; and R code and data files to reproduce the data analysis and 
figures. 
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